AITopics | target network

Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps

Neural Information Processing Systems | Mar-22-2026, 20:45:19 GMT

In reinforcement learning (RL), it is common to apply techniques used broadly in machine learning such as neural network function approximators and momentum-based optimizers. However, such tools were largely developed for supervised learning rather than nonstationary RL, leading practitioners to adopt target networks, clipped policy updates, and other RL-specific implementation tricks to combat this mismatch, rather than directly adapting this toolchain for use in RL. In this paper, we take a different approach and instead address the effect of nonstationarity by adapting the widely used Adam optimiser. We first analyse the impact of nonstationary gradient magnitude (such as that caused by a change in target network) on Adam's update size, demonstrating that such a change can lead to large updates and hence sub-optimal performance. To address this, we introduce Adam-Rel. Rather than using the global timestep in the Adam update, Adam-Rel uses the timestep within an epoch, essentially resetting Adam's timestep to 0 after target changes. We demonstrate that this avoids large updates and reduces to learning rate annealing in the absence of such increases in gradient magnitude. Evaluating Adam-Rel in both on-policy and off-policy RL, we demonstrate improved performance in both Atari and Craftax. We then show that increases in gradient norm occur in RL in practice, and examine the differences between our theoretical model and the observed data.
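The timestep reset described in the abstract can be illustrated with a minimal scalar sketch. It assumes the textbook Adam update with bias correction and assumes that only the timestep is reset while the moment estimates are kept. The class name `RelativeAdam` and the method `reset_timestep` are hypothetical, and this is not the authors' implementation.

```python
import math


class RelativeAdam:
    """Scalar Adam whose bias-correction timestep can be reset (hypothetical name)."""

    def __init__(self, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = 0.0  # first-moment estimate, assumed to survive a reset
        self.v = 0.0  # second-moment estimate, assumed to survive a reset
        self.t = 0    # local timestep, counted from the last reset

    def reset_timestep(self):
        # Assumed to be called when the target changes, e.g. at a new epoch
        # or after a target-network update.
        self.t = 0

    def step(self, param, grad):
        """Return the updated parameter after one Adam step."""
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad * grad
        # Bias correction uses the local timestep rather than a global one.
        m_hat = self.m / (1 - self.beta1 ** self.t)
        v_hat = self.v / (1 - self.beta2 ** self.t)
        return param - self.lr * m_hat / (math.sqrt(v_hat) + self.eps)
```

In this sketch, calling `reset_timestep()` just before a jump in gradient magnitude makes the bias correction rescale both moment estimates, so the step that follows a long run of small gradients is smaller than it would be with a global timestep.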

artificial intelligence, machine learning, proceedings, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Reconciling λ-Returns with Experience Replay

Brett Daley, Christopher Amato

Neural Information Processing Systems | Feb-13-2026, 06:41:28 GMT

A unique benefit to this approach is that each transition's TD error can be

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Deliberative Explanations: visualizing network insecurities

Pei Wang, Nuno Vasconcelos

Neural Information Processing Systems | Feb-12-2026, 10:57:30 GMT

The explanation consists of a list of insecurities, each composed of 1) an image region (more generally, a set of input variables), and 2) an ambiguity formed by the pair of classes responsible for the network uncertainty about the region.

artificial intelligence, explanation, machine learning, (18 more...)

Country:

North America > United States > California (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Italy > Marche > Ancona Province > Ancona (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers

Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang

Neural Information Processing Systems | Feb-12-2026, 09:22:12 GMT

The fundamental learning theory behind neural networks remains largely open. What classes of functions can neural networks actually learn?

artificial intelligence, machine learning, neural network, (18 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

fd06b8ea02fe5b1c2496fe1700e9d16c-Supplemental.pdf

Neural Information Processing Systems | Feb-12-2026, 01:12:23 GMT

agent, confidence interval, experiment, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.41)

The Elastic Lottery Ticket Hypothesis

Neural Information Processing Systems | Feb-11-2026, 13:15:51 GMT

Work was done when the author interned at Microsoft.

artificial intelligence, machine learning, ticket, (16 more...)

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre:

Contests & Prizes (0.60)
Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Gambling (0.43)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Prioritizing Samples in Reinforcement Learning with Reducible Loss

Neural Information Processing Systems | Feb-11-2026, 08:36:46 GMT

Most reinforcement learning algorithms take advantage of an experience replay buffer to repeatedly train on samples the agent has observed in the past. Not all samples carry the same amount of significance and simply assigning equal importance to each of the samples is a naive strategy. In this paper, we propose a method to prioritize samples based on how much we can learn from a sample.
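The abstract does not say how "how much we can learn from a sample" is measured, so the following is only an illustrative sketch. It assumes that learnability is read as the gap between a sample's current loss and a reference loss (for example, a loss computed with an older copy of the network) and that replay sampling probabilities are proportional to that gap. The function name, the `floor` parameter, and this reading of "reducible" are all assumptions, not the paper's definition.

```python
def reducible_loss_priorities(current_losses, reference_losses, floor=1e-6):
    """Turn per-sample loss gaps into sampling probabilities (illustrative only)."""
    if len(current_losses) != len(reference_losses):
        raise ValueError("loss lists must have the same length")
    # Assumption: a sample whose current loss sits well above its reference
    # loss still has something to teach; negative gaps are clipped to zero.
    # The small floor keeps every sample reachable.
    gaps = [
        max(cur - ref, 0.0) + floor
        for cur, ref in zip(current_losses, reference_losses)
    ]
    total = sum(gaps)
    return [g / total for g in gaps]
```

Under these assumptions, a sample with a high loss that cannot be reduced further receives a low priority, unlike schemes that rank samples by raw loss alone.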

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
South America > Brazil > Pernambuco (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

f3ada80d5c4ee70142b17b8192b2958e-Supplemental.pdf

Neural Information Processing Systems | Feb-11-2026, 02:36:04 GMT

First, a random patch of the image is selected and resized to 224 × 224 with a random horizontal flip, followed by a color distortion, consisting of a random sequence of brightness, contrast, saturation, hue adjustments, and an optional grayscale conversion. Finally, Gaussian blur and solarization are applied to the patches. Optimization: We use the LARS optimizer [70] with a cosine decay learning rate schedule [71], without restarts, over 1000 epochs, with a warm-up period of 10 epochs. We set the base learning rate to 0.2, scaled linearly [72] with the batch size (LearningRate = 0.2 × BatchSize/256). For the target network, the exponential moving average parameter τ starts from τ_base = 0.996 and is increased to one during training.
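The schedule in this excerpt can be sketched in a few lines of Python. The linear scaling rule, the 1000 epochs, the 10 warm-up epochs, and τ_base = 0.996 come from the text. The linear shape of the warm-up, the choice to decay over the epochs that remain after warm-up, the cosine ramp for τ, and all function names are assumptions made for illustration.

```python
import math


def base_learning_rate(batch_size, base_lr=0.2):
    # Linear scaling rule from the text: LearningRate = 0.2 x BatchSize / 256.
    return base_lr * batch_size / 256


def learning_rate(epoch, batch_size, total_epochs=1000, warmup_epochs=10):
    peak = base_learning_rate(batch_size)
    if epoch < warmup_epochs:
        # Assumed linear warm-up; the text gives only the warm-up length.
        return peak * (epoch + 1) / warmup_epochs
    # Cosine decay without restarts over the remaining epochs (assumed span).
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return peak * 0.5 * (1 + math.cos(math.pi * progress))


def ema_tau(step, total_steps, tau_base=0.996):
    # Assumed cosine ramp from tau_base to 1; the text states only the endpoints.
    return 1 - (1 - tau_base) * (math.cos(math.pi * step / total_steps) + 1) / 2


def ema_update(target, online, tau):
    # Standard exponential moving average of parameters (lists of floats here).
    return [tau * t + (1 - tau) * o for t, o in zip(target, online)]
```

For example, `base_learning_rate(4096)` scales the base rate of 0.2 by a factor of 16, and `ema_update` moves the target parameters less and less as `ema_tau` approaches one.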

artificial intelligence, augmentation, machine learning, (19 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
